comparison process, i.e., the alignment-based approach and the
alignment-free approach.
Alignment-based multiple sequence comparison is implemented by
different methods. The most typical are the progressive method,
the iterative method and the consensus method. The
progressive method is a heuristic approach [Feng and Doolittle, 1987;
et al., 2018] and has had a wide range of applications [Loytynoja
and Goldman, 2005; Deorowicz, et al., 2016; Ayad and Pissis, 2017;
et al., 2018; Rubio-Largo, et al., 2018]. Among them, the
progressive method is the most effective and efficient.
The basic principle of the progressive method is to align sequences
in order, from the most related pairs to the least related pairs. First,
all pairs of sequences are compared using a fast non-alignment
(alignment-free) sequence comparison method. Based on these initial
comparison results, a hierarchical tree is constructed. Afterwards, the most
related pair of sequences is found and aligned using a homology
alignment approach. The remaining sequences are then progressively added to
an alignment model, which is expressed as a tree.
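
As a rough illustration of the first two steps described above, the following base R sketch compares five short DNA sequences without alignment and builds a hierarchical tree from the result. It is only a sketch under stated assumptions: the 2-mer count profile, the Euclidean distance and the average-linkage (UPGMA) clustering are choices made here for brevity and are not necessarily what the Clustal programs or msa do internally. The labels s1 to s5 and the helper functions kmers and kmer_profile are hypothetical names.

```r
# Toy input: five short DNA sequences (labels s1..s5 are hypothetical).
seqs <- c(s1 = "GATGTATGGACCCG", s2 = "GATGTATGGACCCG",
          s3 = "GATGTATCCACCCG", s4 = "CATGTATGGACCCG",
          s5 = "CCAATATCGCTTCT")

# All overlapping k-mers of a single sequence.
kmers <- function(s, k) {
  starts <- seq_len(nchar(s) - k + 1)
  substring(s, starts, starts + k - 1)
}

# Alignment-free profile: one row per sequence, one column per observed k-mer.
kmer_profile <- function(seqs, k = 2) {
  per_seq <- lapply(seqs, kmers, k = k)
  vocab <- sort(unique(unlist(per_seq)))
  counts <- vapply(
    per_seq,
    function(x) tabulate(match(x, vocab), nbins = length(vocab)),
    integer(length(vocab))
  )
  counts <- t(counts)            # rows = sequences, columns = k-mers
  colnames(counts) <- vocab
  counts
}

profile <- kmer_profile(seqs, k = 2)

# Step 1: fast all-against-all comparison (assumed: Euclidean distance
# between 2-mer count vectors).
d <- dist(profile)

# Step 2: hierarchical tree; average linkage corresponds to UPGMA.
guide_tree <- hclust(d, method = "average")

# Step 3: the merge table gives the order in which sequences (negative
# indices) and earlier clusters (positive indices) would be joined.
print(guide_tree$merge)
```

In this toy input s1 and s2 are identical, so they are joined first at height zero. A progressive aligner would start from such a pair and add the remaining sequences by following the tree upwards.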
The Clustal series of packages and the msa package implement this
approach. msa is a Bioconductor package in which Clustal
alignment approaches are implemented. Its main supporting package
is Biostrings. Suppose the following sequences are required to
be aligned:
GATGTATGGACCCG
GATGTATGGACCCG
GATGTATCCACCCG
CATGTATGGACCCG
CCAATATCGCTTCT
The following R code employs the Smith-Waterman algorithm to
align these five sequences pairwise. Because every pair of sequences
is required to be aligned, a nested (double) for loop is used. The
output of this code is a matrix of similarity scores between all pairs of
sequences.